Mixture of Markov Trees for Bayesian Network Structure Learning with Small Datasets in High Dimensional Space
Authors
Abstract
The recent explosion of high dimensionality in datasets across several domains poses a serious challenge to existing Bayesian network structure learning algorithms. Local search methods offer a solution in such spaces but perform poorly with small datasets. MMHC (Max-Min Hill-Climbing) is one such local search algorithm: a first phase identifies a candidate skeleton using statistical association measures, and a second phase performs a greedy search restricted to this skeleton. We propose to replace the first phase, which is imprecise when the number of samples remains relatively small, with an application of the "Perturb and Combine" framework that we have already studied for density estimation using mixtures of bagged trees.
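The idea of perturbing the data and combining the resulting trees can be illustrated with a minimal sketch. It assumes discrete data in a NumPy array, fits a Chow-Liu tree (maximum spanning tree over pairwise mutual information) on each bootstrap replicate, and reports how often each edge appears. The abstract does not say how the mixture is turned into a skeleton, so using edge frequency for that purpose is an assumption made here for illustration, and all function names are hypothetical.

```python
import numpy as np
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete columns."""
    n = len(x)
    xs, ys = x.tolist(), y.tolist()
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), written with raw counts
        mi += (c / n) * np.log(c * n / (px[a] * py[b]))
    return mi


def chow_liu_edges(data):
    """Undirected edges of a maximum spanning tree over pairwise MI (Prim)."""
    d = data.shape[1]
    w = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            w[i, j] = w[j, i] = mutual_information(data[:, i], data[:, j])
    in_tree = [0]
    edges = set()
    while len(in_tree) < d:
        best = None
        for i in in_tree:
            for j in range(d):
                if j not in in_tree and (best is None or w[i, j] > best[0]):
                    best = (w[i, j], i, j)
        _, i, j = best
        edges.add((min(i, j), max(i, j)))
        in_tree.append(j)
    return edges


def bagged_tree_edge_frequencies(data, n_trees=50, seed=0):
    """Fraction of bootstrap Chow-Liu trees in which each edge appears."""
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    counts = Counter()
    for _ in range(n_trees):
        # Perturb: resample the rows with replacement (bagging).
        sample = data[rng.integers(0, n, size=n)]
        # Combine: accumulate the edges of the tree learned on this replicate.
        counts.update(chow_liu_edges(sample))
    return {edge: c / n_trees for edge, c in counts.items()}
```

Under this assumption, edges whose frequency exceeds a chosen threshold could serve as the candidate skeleton handed to the restricted greedy search; the threshold itself would have to be tuned and is not taken from the paper.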
Similar resources
Learning Bayesian Network Structure using Markov Blanket in K2 Algorithm
A Bayesian network is a graphical model that represents a set of random variables and their causal relationships via a Directed Acyclic Graph (DAG). There are basically two tasks in learning a Bayesian network: parameter learning and structure learning. One of the most effective structure-learning methods is the K2 algorithm. Because the performance of the K2 algorithm depends on node...
Detecting Overlapping Communities in Social Networks using Deep Learning
In network analysis, a community is typically thought of as a group of nodes with a high density of edges among themselves and a low density of edges to other parts of the network. Detecting community structure is important in any network analysis task, especially for revealing patterns between specified nodes. There is a variety of approaches presented in the literature for overlapping...
Towards sub-quadratic learning of probability density models in the form of mixtures of trees
We consider randomization schemes of the Chow-Liu algorithm from weak (bagging, of quadratic complexity) to strong ones (full random sampling, of linear complexity), for learning probability density models in the form of mixtures of Markov trees. Our empirical study on high-dimensional synthetic problems shows that, while bagging is the most accurate scheme on average, some of the stronger rand...
BANFF: An R Package for BAyesian Network Feature Finder
Feature selection on high-dimensional networks plays an important role in understanding biological mechanisms and disease pathologies, and it has a broad range of applications. Recently, a Bayesian nonparametric mixture model (Zhao, Kang, and Yu 2014) has been successfully applied to selecting genes and gene sub-networks. We extend this method to a unified approach for feature selection on gener...
Bayesian Additive Regression Trees using Bayesian model averaging
Bayesian Additive Regression Trees (BART) is a statistical sum-of-trees model. It can be considered a Bayesian version of machine learning tree ensemble methods in which the individual trees are the base learners. However, for datasets where the number of variables p is large (e.g. p > 5,000), the algorithm can become prohibitively expensive computationally. Another method which is popular for hig...